---
title: Installation and Setup
description: "Learn how to install, set up and configure OM1 on your robots."
---

## System Requirements

### Operating System

- macOS 12.0+
- Linux (Ubuntu 20.04+)

### Hardware

- Memory (RAM): 4GB
- Storage: 8GB
- Camera, microphone, speakers, and other robot sensors and peripherals as needed

### Software

- Python 3.10+
- uv 0.6.2 or newer
- [Openmind API key](https://portal.openmind.org/)

### Prerequisites

Ensure you have the following installed on your machine:

- uv as the Python package manager
- ffmpeg for video processing
- portaudio for audio input and output

#### Package Manager

```bash Mac
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```
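
On Apple Silicon Macs, the installer finishes by asking you to add Homebrew to your shell's `PATH`; the exact commands are printed at the end of the install, but they typically look like this:

```bash Mac (Apple Silicon)
# Add Homebrew to future shells and to the current session
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
```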

```bash Linux
sudo apt-get update && sudo apt-get upgrade -y
```
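
The steps that follow use `curl` (to install uv) and `git` (to clone the repository); if either is missing on your Linux machine, install them first:

```bash Linux
sudo apt-get install -y curl git
```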

#### uv (a Python package manager written in Rust)

```bash Mac
brew install uv
```

```bash Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
```
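
To confirm that uv is installed and on your `PATH` (you may need to restart your shell first), check its version:

```bash
uv --version
```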

#### PortAudio Library

PortAudio handles audio input and output, letting you speak to the LLM and hear its voice responses. You need `portaudio` on both macOS and Linux.

```bash Mac
brew install portaudio
```

```bash Linux
sudo apt-get update
sudo apt-get install portaudio19-dev python3-all-dev
```
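
If you want to confirm that the library is present, the package managers can tell you:

```bash Mac
brew list portaudio
```

```bash Linux
dpkg -s portaudio19-dev | grep Status
```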

#### ffmpeg

FFmpeg is a multimedia framework that can decode, encode, transcode, mux, demux, stream, filter, and play nearly any audio or video format. OM1 uses it for video processing.

```bash Mac
brew install ffmpeg
``` 

```bash Linux
sudo apt-get update
sudo apt-get install ffmpeg
``` 
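
As a final prerequisite check, make sure `ffmpeg` resolves on your `PATH`:

```bash
ffmpeg -version
```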

### Installation and Setup

1. Clone the repository

Run the following commands to clone the repository and set up the environment:

```bash clone repo
git clone https://github.com/OpenmindAGI/OM1.git
cd OM1
git submodule update --init
uv venv
```
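
`uv venv` creates a virtual environment in the `.venv` folder of the repository. You do not need to activate it manually, since `uv run` (used below) picks it up automatically, but you can if you prefer to work inside the environment directly:

```bash activate venv (optional)
source .venv/bin/activate
```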

2. Set the configuration variables

Locate the `config` folder and add your Openmind API key to `/config/spot.json`. If you do not already have one, you can obtain a free access key at https://portal.openmind.org/. _Note:_ Using the placeholder key **openmind-free** will generate errors.

```json /config/spot.json
...
"api_key": "om1_live_e4252f1cf005af..."
...
```
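
A stray comma or quote in the config will keep the agent from starting, so it can be worth sanity-checking the JSON after editing (run from the repository root):

```bash validate config
python3 -m json.tool config/spot.json
```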

Alternatively, create a `.env` file in your `$HOME` directory and add the following:

```bash
OM_API_KEY=your_api_key_here
```
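
If you prefer the command line, the same file can be created in one step (substitute your real key for the placeholder):

```bash create .env
echo "OM_API_KEY=your_api_key_here" > ~/.env
```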

3. Run the Spot Agent

Run the following command to start the Spot Agent:

```bash
uv run src/run.py spot
```

4. Use WebSim to check inputs and outputs

Go to [http://localhost:8000](http://localhost:8000) to see real-time logs alongside the input and output in the terminal. For easier debugging, add `--debug` to see additional logging information, as shown below.
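
For example, using the same Spot configuration as above:

```bash debug mode
uv run src/run.py spot --debug
```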

**Congratulations!** - you just got started with OM1 and can now explore its capabilities.

The first time you run the command, the required packages will be installed, which can take a little time. Once that finishes, you will see the system come to life:

```bash
Object Detector INPUT
// START
You see a person in front of you. You also see a laptop.
// END

AVAILABLE ACTIONS:
command: move
    A movement to be performed by the agent.
    Effect: Allows the agent to move.
    Arguments: Allowed values: 'stand still', 'sit', 'dance', 'shake paw', 'walk', 'walk back', 'run', 'jump', 'wag tail'

command: speak
    Words to be spoken by the agent.
    Effect: Allows the agent to speak.
    Arguments: <class 'str'>

command: emotion
    A facial expression to be performed by the agent.
    Effect: Performs a given facial expression.
    Arguments: Allowed values: 'cry', 'smile', 'frown', 'think', 'joy'

What will you do? Command: 

INFO:httpx:HTTP Request: POST https://api.openmind.org/api/core/openai/chat/completions "HTTP/1.1 200 OK"
INFO:root:OpenAI LLM output: commands=[Command(type='move', value='wag tail'), Command(type='speak', value="Hi there! I see you and I'm excited!"), Command(type='emotion', value='joy')]
```

#### Explanation of the log data

The response above shows how the Spot agent processes its environment and decides on its next actions:
  - First, it detects a person (and a laptop) using vision.
  - Next, it decides on a friendly action (wagging its tail and speaking).
  - It expresses emotion through a facial display (`joy`).
  - It logs latency and processing times to monitor system performance.
  - It communicates with an external AI API for response generation.

Overall, the system follows a predefined behavior in which spotting a person triggers a joyful interaction, driven by LLM-assisted decision making.